# Low-resource Inference
Magtie V1 12B GGUF
Apache-2.0
A collection of GGUF-quantized versions of MagTie-v1-12B, a model created by merging pre-trained language models with mergekit; suitable for text generation tasks.
Large Language Model
Transformers
grimjim
373
2
Qwen Qwen3 8B GGUF
Apache-2.0
GGUF format quantized version of Qwen3-8B, provided by TensorBlock, compatible with llama.cpp
Large Language Model
tensorblock
452
1
Deepseek R1 GGUF UD
MIT
A GGUF version of the DeepSeek-R1 large language model, quantized with Unsloth Dynamic v2.0 to preserve accuracy at reduced precision.
Large Language Model English
unsloth
3,149
11
Orpheus 3b Kaya Q2 K.gguf
Apache-2.0
An English text-to-speech model fine-tuned from Canopy Labs' pre-trained model and quantized to the GGUF Q2_K format for efficient inference.
Speech Synthesis Supports Multiple Languages
lex-au
39
0
Meta Llama Llama 4 Scout 17B 16E Instruct Old GGUF
Other
Llama-4-Scout-17B-16E-Instruct is an instruction-tuned mixture-of-experts large language model released by Meta, with 17B active parameters and 16 experts; this version has been quantized for more efficient inference.
Large Language Model Supports Multiple Languages
bartowski
3,142
30
Gemma 3 4b It Abliterated Q4 0 GGUF
This model is a GGUF format conversion of mlabonne/gemma-3-4b-it-abliterated, combined with the visual component of x-ray_alpha for a smoother multimodal experience.
Image-to-Text
BernTheCreator
160
1
Gemma 3 4b It Q4 0
Gemma 3 4B Instruct is a 4-billion-parameter large language model developed by Google, focusing on text generation and comprehension tasks.
Large Language Model
danchev
24
0
Doge 120M MoE Instruct
Apache-2.0
The Doge model employs dynamic masked attention mechanisms for sequence transformation and can use multi-layer perceptrons or cross-domain mixture of experts for state transitions.
Large Language Model
Transformers English
SmallDoge
240
1
Bge Reranker Base Q4 K M GGUF
MIT
A GGUF-format reranking model converted from BAAI/bge-reranker-base, supporting Chinese and English text ranking tasks.
Text Embedding Supports Multiple Languages
sabafallah
44
0
Thedrummer Fallen Gemma3 4B V1 GGUF
Other
This is a quantized version of the TheDrummer/Fallen-Gemma3-4B-v1 model, processed using llama.cpp and suitable for text generation tasks.
Large Language Model
bartowski
2,106
3
Gemmax2 28 9B V0.1 Q2 K GGUF
GemmaX2-28-9B-v0.1-Q2_K-GGUF is a GGUF format model converted from ModelSpace/GemmaX2-28-9B-v0.1, supporting multilingual translation tasks.
Large Language Model Supports Multiple Languages
Gemini
151
1
Qwen2.5 Bakeneko 32b Instruct V2 Gguf
Apache-2.0
This is a quantized version of rinna/qwen2.5-bakeneko-32b-instruct-v2 using llama.cpp, compatible with various llama.cpp-based applications.
Large Language Model Japanese
rinna
597
5
Gemma 3 4b It Q4 K M GGUF
Gemma 3 4B IT is an open-source large language model developed by Google. This version is the 4-bit quantized version converted to GGUF format via llama.cpp.
Large Language Model
DravenBlack
186
1
Google.gemma 3 4b It GGUF
Gemma 3 4B IT is the instruction-tuned version of Google's 4-billion-parameter Gemma 3 large language model, suitable for various natural language processing tasks.
Large Language Model
DevQuasar
141
1
Text Summarization Q8 0 GGUF
Apache-2.0
This model is a GGUF-format text summarization model converted from Falconsai/text_summarization, designed for efficient inference via llama.cpp.
Text Generation English
vynride
22
0
Nousresearch DeepHermes 3 Llama 3 8B Preview GGUF
A dialogue model fine-tuned from Llama-3-8B, available in multiple quantization levels and suitable for tasks such as chatting, reasoning, and role-playing.
Large Language Model English
bartowski
1,038
16
Archaeo 12B
A merged model designed for role-playing and creative writing, combining Rei-12B and Francois-Huali-12B via the SLERP algorithm.
Large Language Model
Transformers
Delta-Vector
168
12
Llama 3.1 0x Mini Q8 0 GGUF
This is a GGUF format model converted from ozone-ai/llama-3.1-0x-mini, suitable for the llama.cpp framework.
Large Language Model
NikolayKozloff
19
1
Senecallm X Qwen2.5 7B CyberSecurity Q8 0 GGUF
MIT
This is a large language model for the cybersecurity domain based on the Qwen2.5-7B architecture, converted to GGUF format for use with llama.cpp.
Large Language Model English
Nekuromento
18
1
Mistral Portuguese Luana 7b Mental Health Q5 K M GGUF PTBR
Apache-2.0
This is a Portuguese-language Mistral model, fine-tuned for the mental health domain and suitable for Portuguese text generation tasks.
Large Language Model Other
noxinc
22
2
Suzume Llama 3 8B Multilingual
Other
Suzume 8B is a multilingual fine-tuned version based on Llama 3, trained on nearly 90,000 multilingual dialogues to enhance multilingual communication capabilities while maintaining Llama 3's intelligence level.
Large Language Model
Transformers
lightblue
9,494
112
Saiga Llama3 8b
Other
A Russian chat assistant based on Llama-3 8B Instruct, specially trained to provide Russian dialogue support.
Large Language Model
Transformers Other
IlyaGusev
25.29k
123
Percival 01 7b Slerp
Apache-2.0
Percival_01-7b-slerp is a 7B-parameter large language model, reported as ranking second on the Open LLM Leaderboard, obtained by merging the liminerity/M7-7b and Gille/StrangeMerges_32-7B-slerp models using the LazyMergekit tool.
Large Language Model
Transformers
AurelPx
24
4
Saul Instruct V1 GGUF
MIT
Saul-Instruct-v1-GGUF is the GGUF format version of the Equall/Saul-Instruct-v1 model, suitable for text generation tasks and supports multiple quantization levels.
Large Language Model English
MaziyarPanahi
456
8
Tinymixtral 4x248M MoE
Apache-2.0
TinyMixtral-4x248M-MoE is a small language model adopting the Mixture of Experts (MoE) architecture, formed by merging multiple TinyMistral variants, suitable for text generation tasks.
Large Language Model
Transformers
Isotonic
1,310
2
Tinymistral 6x248M Instruct
Apache-2.0
A fine-tuned Mixture of Experts (MoE) language model that merges multiple models using the LazyMergekit framework and performs well on instruction-following tasks.
Large Language Model
Transformers English
M4-ai
1,932
9
Tinymistral 6x248M
Apache-2.0
TinyMistral-6x248M is a Mixture of Experts system that integrates 6 TinyMistral variants using the LazyMergekit tool, pre-trained on the nampdn-ai/mini-peS2o dataset
Large Language Model
Transformers
M4-ai
51
14
Laser Dolphin Mixtral 2x7b Dpo
Apache-2.0
A medium-scale Mixture of Experts (MoE) implementation based on Dolphin-2.6-Mistral-7B-DPO-Laser, with an average performance improvement of approximately 1 point in evaluations
Large Language Model
Transformers
macadeliccc
133
57
Beyonder 4x7B V2
Other
Beyonder-4x7B-v2 is a large language model based on the Mixture of Experts (MoE) architecture, consisting of 4 expert modules, each specializing in different domains such as dialogue, programming, creative writing, and mathematical reasoning.
Large Language Model
Transformers
mlabonne
758
130
Mamba 1B
Apache-2.0
Mamba-1B is a 1B-parameter language model based on the Mamba architecture, supporting English text generation tasks.
Large Language Model
Transformers English
Q-bert
185
28
Swallow 70B Instruct GGUF
GGUF-format model files for the Swallow 70B Instruct language model, compatible with multiple clients and libraries for text generation across different scenarios.
Large Language Model
Transformers Supports Multiple Languages
TheBloke
366
9
Dolphin 2.5 Mixtral 8x7b GPTQ
Apache-2.0
Dolphin 2.5 Mixtral 8X7B is a large language model developed by Eric Hartford based on the Mixtral architecture, fine-tuned on multiple high-quality datasets, suitable for various natural language processing tasks.
Large Language Model
Transformers English
TheBloke
164
112
Causallm 7B GGUF
CausalLM 7B is a multilingual large language model based on the Llama 2 architecture, supporting English and Chinese text generation tasks.
Large Language Model Supports Multiple Languages
TheBloke
2,776
60
Jellyfish 13B
Jellyfish-13B is a 13-billion-parameter large language model tailored for data preprocessing tasks, including error detection, data imputation, schema matching, and entity matching.
Large Language Model
Transformers English
NECOUDBFM
102
24
Whisper Large Onnx Int4 Inc
Apache-2.0
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. This repository provides the Whisper large model in ONNX format with INT4 weight quantization, powered by Intel® Neural Compressor and Intel® Extension for Transformers.
Speech Recognition
Transformers
Intel
44
8
Mythomax L2 13B AWQ
Other
The AWQ quantized version of MythoMax L2 13B, which can effectively improve inference efficiency.
Large Language Model
Transformers English
TheBloke
1,555
11
Mythalion 13B GGUF
Mythalion 13B is a 13B-parameter large language model developed by PygmalionAI, based on the Llama architecture, specializing in text generation and instruction-following tasks.
Large Language Model English
TheBloke
2,609
67
Llama 2 13B GGUF
Llama 2 is an open-source large language model series developed by Meta. The 13B version is a medium-scale model suitable for various text generation tasks.
Large Language Model English
TheBloke
2,399
63
Codellama Chat 13b Chinese
Openrail
CodeLlama is a model specifically designed for code assistance, excelling in handling programming-related Q&A and supporting multi-turn dialogues in Chinese and English.
Large Language Model
Transformers Supports Multiple Languages
shareAI
16
21
Pygmalion 6b 4bit 128g
Openrail
A 4-bit GPTQ quantized model based on Pygmalion-6B, suitable for dialogue generation tasks, supporting English text generation
Large Language Model
Transformers English
mayaeary
40
40